Optical model with polarization direction effects for comparison to measured spectrum

ABSTRACT

A method of controlling a polishing operation includes storing an optical model for a layer stack having a plurality of layers. The optical model has a plurality of input parameters, the plurality of input parameters including a first parameter and a second parameter. The second parameter is a polarization angle or a relative contribution between two orthogonal polarizations. A spectrum reflected from the substrate is measured with an in-sequence or in-situ monitoring system to provide a measured spectrum. The optical model is fit to the measured spectrum, or a plurality of reference spectra are calculated using the optical model and a best matching reference spectrum from the plurality of reference spectra is determined.

TECHNICAL FIELD

The present disclosure relates to polishing control methods, e.g., during chemical mechanical polishing of substrates.

BACKGROUND

An integrated circuit is typically formed on a substrate by the sequential deposition of conductive, semiconductive, or insulative layers on a silicon wafer. A variety of fabrication processes require planarization of a layer on the substrate. For example, for certain applications, e.g., polishing of a metal layer to form vias, plugs, and lines in the trenches of a patterned layer, an overlying layer is planarized until the top surface of a patterned layer is exposed. In other applications, e.g., planarization of a dielectric layer for photolithography, an overlying layer is polished until a desired thickness remains over the underlying layer.

Chemical mechanical polishing (CMP) is one accepted method of planarization. This planarization method typically requires that the substrate be mounted on a carrier head. The exposed surface of the substrate is typically placed against a rotating polishing pad. The carrier head provides a controllable load on the substrate to push it against the polishing pad. A polishing liquid, such as slurry with abrasive particles, is typically supplied to the surface of the polishing pad.

One problem in CMP is determining whether the polishing process is complete, i.e., whether a substrate layer has been planarized to a desired flatness or thickness, or when a desired amount of material has been removed. Variations in the initial thickness of the substrate layer, the slurry composition, the polishing pad condition, the relative speed between the polishing pad and the substrate, and the load on the substrate can cause variations in the material removal rate. These variations cause variations in the time needed to reach the polishing endpoint. Therefore, it may not be possible to determine the polishing endpoint merely as a function of polishing time.

In some systems, a substrate is optically monitored in-situ during polishing, e.g., through a window in the polishing pad. In some optical monitoring processes, a spectrum is measured in-situ, i.e., during a polishing process of CMP. However, existing optical monitoring techniques may not satisfy increasing demands of semiconductor device manufacturers.

SUMMARY

One approach to deriving endpoint data from a spectrum measured in-situ during polishing is to fit a function, e.g., an optical model, to the measured spectrum. The optical model is a function with multiple parameters, e.g., the thickness, index of refraction and extinction coefficient of each layer in the stack. The optical model generates an output spectrum based on the parameters. By fitting the optical model to the measured spectrum, the parameters are selected, e.g., by regression techniques, to provide an output spectrum that closely matches the measured spectrum.

Another approach is to use the optical model to generate a library with a plurality of reference spectra. The measured spectrum is compared to the reference spectra in the library, and the best matching reference spectrum is identified. The parameters that generated the best matching reference spectrum can then be identified.

In some situations, the light that illuminates the substrate may be partially polarized, e.g., due to interaction with optical components between the light source and the substrate. Even if the monitoring system is ostensibly designed to illuminate the substrate with unpolarized light, some polarization effects can occur. Thus, the polarization may be unintentional (in which case the light will likely be partially polarized), or intentional (in which case the light will likely be completely polarized).

A device wafer is typically patterned. This pattern can generate diffraction effects, which can depend on the polarization of the light. However, since the orientation of the substrate may not be known, and the polarization may be unintentional, an optical model which assumes a known polarization (or assumes unpolarized light) may not be accurate, and consequently the endpoint determination may be unreliable. One technique to address this is for the model to include the polarization angle as an input parameter that is varied when fitting or when generating the reference spectra.

In one aspect, a method of controlling a polishing operation includes storing an optical model for a layer stack having a plurality of layers. The optical model has a plurality of input parameters, the plurality of input parameters including a first parameter and a second parameter. The second parameter is a polarization angle or a relative contribution between two orthogonal polarizations. A spectrum reflected from the substrate is measured with an in-sequence or in-situ monitoring system to provide a measured spectrum. The optical model is fit to the measured spectrum. The fitting includes finding a first value of the first parameter and a second value of the second parameter that provide a minimum difference between an output spectrum of the optical model and the measured spectrum. The substrate is polished with the polishing apparatus, and a polishing endpoint or a polishing parameter of the polishing apparatus is adjusted based on the first value.

In another aspect, a method of controlling a polishing operation includes storing an optical model for a layer stack having a plurality of layers. The optical model has a plurality of input parameters, the plurality of input parameters including a first parameter and a second parameter. The second parameter is a polarization angle or a relative contribution between two orthogonal polarizations. Data is stored defining a plurality of first values for the first parameter and a plurality of second values for the second parameter. For each combination of a first value from the plurality of first values and a second value from the plurality of second values, a reference spectrum is calculated using the optical model based on the first value and the second value, to generate a plurality of reference spectra. A spectrum reflected from the substrate is measured with an in-sequence or in-situ monitoring system to provide a measured spectrum. A best matching reference spectrum from the plurality of reference spectra that provides a best match to the measured spectrum is determined. The first value associated with the best matching reference spectrum is determined. The substrate is polished with the polishing apparatus, and a polishing endpoint or a polishing parameter of the polishing apparatus is adjusted based on the first value associated with the best matching reference spectrum.
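The library approach of this aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed optical model: the function `toy_model`, its effective indices (1.50 and 1.40), the fringe amplitude, the wavelength grid, and the value ranges are all hypothetical stand-ins, and the "measured" spectrum is synthetic. The first parameter is taken to be a thickness and the second to be the relative contribution of one of two orthogonal polarizations.

```python
import math

# Hypothetical wavelength grid: 400 nm to 800 nm in 10 nm steps.
WAVELENGTHS_NM = [400.0 + 10.0 * i for i in range(41)]


def toy_model(thickness_nm, te_weight):
    """Hypothetical stand-in for the optical model (not a diffraction solver).

    The two orthogonal polarizations are given different effective indices,
    and te_weight sets their relative contribution to the output spectrum.
    """
    spectrum = []
    for lam in WAVELENGTHS_NM:
        r_te = 0.5 + 0.3 * math.cos(4.0 * math.pi * 1.50 * thickness_nm / lam)
        r_tm = 0.5 + 0.3 * math.cos(4.0 * math.pi * 1.40 * thickness_nm / lam)
        spectrum.append(te_weight * r_te + (1.0 - te_weight) * r_tm)
    return spectrum


def build_library(first_values, second_values):
    """One reference spectrum per (first value, second value) combination."""
    return {(t, w): toy_model(t, w) for t in first_values for w in second_values}


def sum_squared_difference(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_match(library, measured):
    """Key of the reference spectrum closest to the measured spectrum."""
    return min(library,
               key=lambda key: sum_squared_difference(library[key], measured))


thicknesses = [500 + 10 * i for i in range(11)]   # first values, nm
weights = [i / 10 for i in range(11)]             # second values, 0.0 to 1.0
library = build_library(thicknesses, weights)

# Synthetic "measured" spectrum generated from one grid combination.
measured = toy_model(thicknesses[2], weights[7])
best_thickness, best_weight = best_match(library, measured)
print(best_thickness, best_weight)
```

Only `best_thickness`, the first value, would be used for endpoint control; the second value is retained here simply to show that it is selected jointly with the first. A real measured spectrum contains noise, so the minimum difference would not be exactly zero as it is in this synthetic case.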

Implementations of either aspect may include one or more of the following features. Measuring the spectrum may be performed with the in-line monitoring system before or after the polishing of the substrate. Measuring the spectrum may be performed with an in-situ monitoring system during the polishing of the substrate. Measuring the spectrum may include directing polarized light onto the substrate. The second parameter may be a polarization angle. Measuring the spectrum may include directing ostensibly unpolarized light onto the substrate. The second parameter may be the relative contribution between two orthogonal polarizations. The first parameter may be a thickness of an outermost layer.

In another aspect, a non-transitory computer program product, tangibly embodied in a machine readable storage device, includes instructions to carry out the method.

Certain implementations may include one or more of the following advantages. An optical model may account for either unintended polarization effects or the orientation of the substrate. By looking at different polarization angles, additional information may be obtained. For example, by looking at TE and TM polarized light, two spectra (as opposed to one spectrum in the case of unpolarized light) that carry different information can be obtained. The system now has two spectra to which the model can be fit, thus increasing fit confidence. Also, there may be more (or better) information contained in a particular polarization angle. For example, if one polarization state contains good signal information and the other polarization state does not, it is possible to rely only on the former for parameter information. Reliability of the endpoint system to detect a desired polishing endpoint may be improved, and within-wafer and wafer-to-wafer thickness non-uniformity (WIWNU and WTWNU) may be reduced.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a schematic cross-sectional view of an example of a polishing station.

FIG. 2 illustrates a schematic top view of a substrate having multiple zones.

FIG. 3 illustrates a top view of a polishing pad and shows locations where in-situ measurements are taken on a substrate.

FIG. 4 illustrates a schematic cross-sectional view of an example of an in-line monitoring station.

FIG. 5 illustrates a measured spectrum from the in-situ optical monitoring system.

FIG. 6 illustrates a model of a portion of the substrate using a 1-dimensional model of layers of the stack.

FIG. 7 illustrates a model of a portion of the substrate using a 2-dimensional model of layers of the stack.

FIG. 8 illustrates an index trace.

FIG. 9 illustrates an index trace having a linear function fit to index values collected after clearance of an overlying layer is detected.

FIG. 10 is a flow diagram of an example process for controlling a polishing operation.

Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION

One optical monitoring technique for controlling a polishing operation is to measure a spectrum of light reflected from a substrate, either in-situ during polishing or at an in-line metrology station, and fit a function, e.g., an optical model, to the measured spectrum. Another technique is to compare the measured spectrum to a plurality of reference spectra from a library, and identify a matching reference spectrum.

As noted above, even if the monitoring system is ostensibly designed to illuminate the substrate with unpolarized light, some polarization effects can occur. The optical model can include the polarization angle as an input parameter. This permits the polarization angle to be varied when fitting the function, or when generating reference spectra for the library. Thus, the polarization may be unintentional (in which case the light will likely be partially polarized), or intentional (in which case the light will likely be completely polarized).

A substrate can include a first layer (that will undergo polishing) and a second layer disposed under the first layer. Both the first layer and the second layer are at least semi-transparent. Together, the second layer and one or more additional layers (if present) provide a layer stack below the first layer. Examples of layers include an insulator, passivation, etch stop, barrier layer and capping layers. Examples of materials in such layers include oxide, such as silicon dioxide, a low-k material, such as carbon doped silicon dioxide, e.g., Black Diamond™ (from Applied Materials, Inc.) or Coral™ (from Novellus Systems, Inc.), silicon nitride, silicon carbide, carbon-silicon nitride (SiCN), a metal nitride, e.g., tantalum nitride or titanium nitride, or a material formed from tetraethyl orthosilicate (TEOS).

Chemical mechanical polishing can be used to planarize the substrate until a predetermined thickness of the first layer is removed, a predetermined thickness of the first layer remains, or until the second layer is exposed.

FIG. 1 illustrates an example of a polishing apparatus 100. The polishing apparatus 100 includes a rotatable disk-shaped platen 120 on which a polishing pad 110 is situated. The platen is operable to rotate about an axis 125. For example, a motor 121 can turn a drive shaft 124 to rotate the platen 120. The polishing pad 110 can be a two-layer polishing pad with an outer polishing layer 112 and a softer backing layer 114.

The polishing apparatus 100 can include a port 130 to dispense polishing liquid 132, such as a slurry, onto the polishing pad 110. The polishing apparatus can also include a polishing pad conditioner to abrade the polishing pad 110 to maintain the polishing pad 110 in a consistent abrasive state.

The polishing apparatus 100 includes one or more carrier heads 140. Each carrier head 140 is operable to hold a substrate 10 against the polishing pad 110. Each carrier head 140 can have independent control of the polishing parameters, for example pressure, associated with each respective substrate.

In particular, each carrier head 140 can include a retaining ring 142 to retain the substrate 10 below a flexible membrane 144. Each carrier head 140 also includes a plurality of independently controllable pressurizable chambers defined by the membrane, e.g., three chambers 146 a-146 c, which can apply independently controllable pressures to associated zones 148 a-148 c on the flexible membrane 144 and thus on the substrate 10 (see FIG. 2). Referring to FIG. 2, the center zone 148 a can be substantially circular, and the remaining zones 148 b-148 c can be concentric annular zones around the center zone 148 a. Although only three chambers are illustrated in FIGS. 1 and 2 for ease of illustration, there could be one or two chambers, or four or more chambers, e.g., five chambers.

Returning to FIG. 1, each carrier head 140 is suspended from a support structure 150, e.g., a carousel or track, and is connected by a drive shaft 152 to a carrier head rotation motor 154 so that the carrier head can rotate about an axis 155. Optionally each carrier head 140 can oscillate laterally, e.g., on sliders on the carousel 150, by rotational oscillation of the carousel itself, or by motion of a carriage 158 (see FIG. 4) along the track. In operation, the platen is rotated about its central axis 125, and each carrier head is rotated about its central axis 155 and translated laterally across the top surface of the polishing pad.

While only one carrier head 140 is shown, more carrier heads can be provided to hold additional substrates so that the surface area of polishing pad 110 may be used efficiently. Thus, the number of carrier head assemblies adapted to hold substrates for a simultaneous polishing process can be based, at least in part, on the surface area of the polishing pad 110.

In some implementations, the polishing apparatus includes an in-situ optical monitoring system 160, e.g., a spectrographic monitoring system, which can be used to determine whether to adjust a polishing rate or an adjustment for the polishing rate as discussed below. An optical access through the polishing pad is provided by including an aperture (i.e., a hole that runs through the pad) or a solid window 118. The solid window 118 can be secured to the polishing pad 110, e.g., as a plug that fills an aperture in the polishing pad, e.g., is molded to or adhesively secured to the polishing pad, although in some implementations the solid window can be supported on the platen 120 and project into an aperture in the polishing pad.

In some implementations, illustrated in FIG. 4, the polishing apparatus includes an in-sequence optical monitoring system 160 having a probe 180 positioned between two polishing stations, between a polishing station and a transfer station, or in the transfer station. The probe 180 of the in-sequence monitoring system 160 can be supported on a platform 106, and can be positioned on the path of the carrier head.

The probe 180 can include a mechanism to adjust its vertical height relative to the top surface of the platform 106. In some implementations, the probe 180 is supported on an actuator system 182 that is configured to move the probe 180 laterally in a plane parallel to the plane of the track 128. The actuator system 182 can be an XY actuator system that includes two independent linear actuators to move probe 180 independently along two orthogonal axes.

Referring to FIGS. 1 and 4, in either the in-situ or in-sequence embodiments, the optical monitoring system 160 can include a light source 162, a light detector 164, and circuitry 166 for sending and receiving signals between a remote controller 190, e.g., a computer, and the light source 162 and light detector 164. One or more optical fibers can be used to transmit the light from the light source 162 to the optical access in the polishing pad, and to transmit light reflected from the substrate 10 to the detector 164. For example, a bifurcated optical fiber 170 can be used to transmit the light from the light source 162 to the substrate 10 and back to the detector 164. The bifurcated optical fiber can include a trunk 172 positioned in proximity to the optical access, and two branches 174 and 176 connected to the light source 162 and detector 164, respectively. The probe 180 can include the trunk end of the bifurcated optical fiber.

The light source 162 can be operable to emit white light. In one implementation, the white light emitted includes light having wavelengths of 200-800 nanometers. In some implementations, the light source 162 generates unpolarized light. A suitable light source is a xenon lamp or a xenon mercury lamp.

In some implementations, the light source 162 and optical components between the light source 162 and substrate 10 are ostensibly configured to direct unpolarized light onto the substrate 10, e.g., there are no polarization filters or the like in the path of the light between the light source 162 and the substrate 10. Nevertheless, defects in the optical components may cause the light beam that reaches the substrate to be partially polarized. In other implementations, optical components between the light source 162 and substrate 10 are configured to deliberately direct polarized light onto the substrate, e.g., a polarization filter 178 (illustrated in FIG. 4, although it can be used in the in-situ system of FIG. 1) can be positioned between the light source 162 and the substrate 10.

The light detector 164 can be a spectrometer. A spectrometer is an optical instrument for measuring intensity of light over a portion of the electromagnetic spectrum. A suitable spectrometer is a grating spectrometer. Typical output for a spectrometer is the intensity of the light as a function of wavelength (or frequency). FIG. 5 illustrates an example of a measured spectrum 300.

In some in-situ implementations, the top surface of the platen can include a recess 128 into which is fit an optical head 168 that holds one end of the trunk 172 of the bifurcated fiber. The optical head 168 can include a mechanism to adjust the vertical distance between the top of the trunk 172 and the solid window 118.

The output of the circuitry 166 can be a digital electronic signal that passes through a rotary coupler 129, e.g., a slip ring, in the drive shaft 124 to the controller 190 for the optical monitoring system. Similarly, the light source can be turned on or off in response to control commands in digital electronic signals that pass from the controller 190 through the rotary coupler 129 to the optical monitoring system 160. Alternatively, the circuitry 166 could communicate with the controller 190 by a wireless signal.

As noted above, the light source 162 and light detector 164 can be connected to a computing device, e.g., the controller 190, operable to control their operation and receive their signals. The computing device can include a microprocessor situated near the polishing apparatus, e.g., a programmable computer. With respect to control, the computing device can, for example, synchronize activation of the light source with the rotation of the platen 120.

In some in-situ implementations, the light source 162 and detector 164 of the in-situ monitoring system 160 are installed in and rotate with the platen 120. In this case, the motion of the platen will cause the sensor to scan across each substrate. In particular, as the platen 120 rotates, the controller 190 can cause the light source 162 to emit a series of flashes starting just before and ending just after the optical access passes below the substrate 10. Alternatively, the computing device can cause the light source 162 to emit light continuously starting just before and ending just after each substrate 10 passes over the optical access. In either case, the signal from the detector can be integrated over a sampling period to generate spectra measurements at a sampling frequency.

In operation, the controller 190 can receive, for example, a signal that carries information describing a spectrum of the light received by the light detector for a particular flash of the light source or time frame of the detector.

As shown in FIG. 3, if the detector is installed in the platen, due to the rotation of the platen (shown by arrow 204), as the window 108 travels below a carrier head, the optical monitoring system making spectra measurements at a sampling frequency will cause the spectra measurements to be taken at locations 201 in an arc that traverses the substrate 10. For example, each of points 201 a-201 k represents a location of a spectrum measurement by the monitoring system (the number of points is illustrative; more or fewer measurements can be taken than illustrated, depending on the sampling frequency). The sampling frequency can be selected so that between five and twenty spectra are collected per sweep of the window 108. For example, the sampling period can be between 3 and 100 milliseconds.

As shown, over one rotation of the platen, spectra are obtained from different radii on the substrate 10. That is, some spectra are obtained from locations closer to the center of the substrate 10 and some are closer to the edge. Thus, for any given scan of the optical monitoring system across a substrate, based on timing, motor encoder information, and optical detection of the edge of the substrate and/or retaining ring, the controller 190 can calculate the radial position (relative to the center of the substrate being scanned) for each measured spectrum from the scan. The polishing system can also include a rotary position sensor, e.g., a flange attached to an edge of the platen that will pass through a stationary optical interrupter, to provide additional data for determination of which substrate and the position on the substrate of the measured spectrum. The controller can thus associate the various measured spectra with the controllable zones 148 b-148 e (see FIG. 2) on the substrates 10 a and 10 b. In some implementations, the time of measurement of the spectrum can be used as a substitute for the exact calculation of the radial position.
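The association of a measured spectrum with a zone can be sketched as below. This is a simplified illustration, assuming the carrier head is not sweeping laterally during the scan and that the platen angle is measured from the line joining the platen axis to the substrate center, so that the radial position follows from the law of cosines. The function names, the zone boundary radii, and the geometry values are hypothetical; a real controller would also use the edge detection and encoder data described above.

```python
import bisect
import math


def radial_position(sensor_radius_mm, head_offset_mm, platen_angle_rad):
    """Distance from the substrate center to the sensor (law of cosines).

    sensor_radius_mm: distance of the optical access from the platen axis.
    head_offset_mm:   distance of the substrate center from the platen axis.
    platen_angle_rad: angle between the two, zero when the sensor passes
                      directly under the substrate center.
    """
    squared = (sensor_radius_mm ** 2 + head_offset_mm ** 2
               - 2.0 * sensor_radius_mm * head_offset_mm
               * math.cos(platen_angle_rad))
    return math.sqrt(max(0.0, squared))


# Hypothetical outer radii of the concentric zones, innermost first.
ZONE_OUTER_RADII_MM = [50.0, 100.0, 150.0]
ZONE_LABELS = ["148 a", "148 b", "148 c"]


def zone_for_radius(radius_mm):
    """Label of the zone containing radius_mm, or None if off the substrate."""
    index = bisect.bisect_left(ZONE_OUTER_RADII_MM, radius_mm)
    return ZONE_LABELS[index] if index < len(ZONE_LABELS) else None


# Example: sensor and substrate center both 190 mm from the platen axis.
for angle_deg in (0.0, 20.0, 40.0):
    r = radial_position(190.0, 190.0, math.radians(angle_deg))
    print(angle_deg, round(r, 1), zone_for_radius(r))
```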

Over multiple rotations of the platen, for each zone, a sequence of spectra can be obtained over time. Without being limited to any particular theory, the spectrum of light reflected from the substrate 10 evolves as polishing progresses (e.g., over multiple rotations of the platen, not during a single sweep across the substrate) due to changes in the thickness of the outermost layer, thus yielding a sequence of time-varying spectra. Moreover, particular spectra are exhibited by particular thicknesses of the layer stack.

If the probe of the in-line monitoring system is used, the probe can be moved relative to the substrate due to rotation of the carrier head, lateral motion of the carrier head, or lateral motion of the probe, to measure multiple spectra situated along a path on the substrate.

For either in-situ or in-line monitoring, the controller, e.g., the computing device, can be programmed to fit a function, e.g., an optical model, to the measured spectrum. The function has multiple input parameters, and generates an output spectrum calculated from the input parameters. The input parameters include at least a parameter that can be used to control the polishing endpoint or which can be used to adjust a polishing process, e.g., the thickness of the first layer. However, the parameter from which the polishing endpoint can readily be determined could also be a thickness removed, or a more generic representation of the progress of the substrate through the polishing process, e.g., an index value representing the time or number of platen rotations at which the spectrum would be expected to be observed in a polishing process that follows a predetermined progress.

In some in-situ implementations, the function is fit to each spectrum in the sequence, thereby generating a sequence of fitted parameter values, e.g., a sequence of fitted thickness values.

The optical model at least partially accounts for diffraction effects generated by a repeating feature on the substrate. At least one of the input parameters represents a characteristic of the repeating feature. As shown in FIG. 6, the repeating feature can be represented with a 1-dimensional model (e.g. repeating lines and spaces). In this case, the diffracted light resulting from the repeating feature can be optically modeled with a “1-D” diffraction grating, and the input parameter can be a line width or a line pitch. This model may be appropriate for regions of the substrate having multiple parallel conductive traces.

Alternatively, referring to FIG. 7, the repeating feature can be represented with a 2-dimensional model (e.g. repeating shapes). In this case, the diffracted light resulting from the repeating feature can be optically modeled with a “2-D” diffraction grating, and the input parameter can be the feature dimension and/or the feature pitch in either or both dimensions. This model may be appropriate for regions of the substrate with repeating cells, e.g., DRAM structures. The 2-D model includes a unit cell 300 that includes a portion 310 of one material (with first optical characteristics) and a portion 320 of a different material (with different optical characteristics). Although FIG. 7 illustrates a simple 2-D parallelepiped volume of different material than the surrounding, the repeating feature can be more complex and can include multiple sub-features.

Other input parameters of the optical model can include the thickness, index of refraction and/or extinction coefficient of each of the layers.

An additional input parameter of the optical model is either the polarization, or a relative weighting between two orthogonal polarizations.
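The two forms of this parameter can be related with a short sketch. It assumes, beyond what is stated above, that the repeating feature does not couple the two orthogonal polarizations into each other and that the detector is insensitive to polarization, so the reflected intensities simply add. Under that assumption, linearly polarized light at an angle from the TE axis carries a fraction cos² of its power in the TE component. The function names and sample values are hypothetical.

```python
import math


def te_weight_from_angle(angle_deg):
    """Fraction of incident power in the TE component for linearly
    polarized light at angle_deg from the TE axis (assumed geometry)."""
    return math.cos(math.radians(angle_deg)) ** 2


def mix_polarizations(r_te, r_tm, te_weight):
    """Weighted sum of the TE and TM spectra, wavelength by wavelength."""
    if not 0.0 <= te_weight <= 1.0:
        raise ValueError("te_weight must lie between 0 and 1")
    if len(r_te) != len(r_tm):
        raise ValueError("spectra must have the same length")
    return [te_weight * a + (1.0 - te_weight) * b for a, b in zip(r_te, r_tm)]


# Hypothetical TE and TM reflectances at two wavelengths.
r_te = [0.2, 0.4]
r_tm = [0.6, 0.8]
print(mix_polarizations(r_te, r_tm, te_weight_from_angle(45.0)))
```

With this parameterization, completely polarized light is described by the angle, while partially polarized or ostensibly unpolarized light is described directly by the weight (0.5 for ideal unpolarized light).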

The diffraction effects can be calculated using rigorous coupled-wave analysis (RCWA). RCWA equations can be used to generate a reflectance R for each wavelength, and then to determine a diffraction efficiency at each wavelength.

Details of RCWA are laid out in “Formulation for stable and efficient implementation of the rigorous coupled-wave analysis of binary gratings” by Moharam et al., and “Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: enhanced transmittance matrix approach” by Moharam et al., each of which is incorporated by reference.

For example, for optical modeling of a “1-D” diffraction grating, equations 24-26 from “Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: enhanced transmittance matrix approach” can be used to generate R for each wavelength, and the diffraction efficiency can be determined at each wavelength via equations 25 and 45 from “Formulation for stable and efficient implementation of the rigorous coupled-wave analysis of binary gratings.”

The diffraction efficiency is normalized to the diffraction efficiency of blanket silicon to match the reflectance spectra of the in-situ monitoring system, which also is normalized to silicon to get rid of lamp, pad, and process effects. The silicon-normalized diffraction efficiency is then compared to the measured spectra.
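The effect of the normalization can be illustrated with a small sketch. It assumes a purely multiplicative system response (for example, lamp intensity and window transmission) that is the same for the device measurement and the blanket silicon measurement; all numerical values and the function name are hypothetical.

```python
def normalize_to_silicon(spectrum, silicon_spectrum, floor=1e-12):
    """Divide a spectrum by the blanket-silicon spectrum, point by point.

    floor guards against division by zero at wavelengths with no signal.
    """
    if len(spectrum) != len(silicon_spectrum):
        raise ValueError("spectra must have the same length")
    return [s / max(ref, floor) for s, ref in zip(spectrum, silicon_spectrum)]


# Hypothetical values at three wavelengths.
device_efficiency = [0.30, 0.45, 0.20]    # modeled device response
silicon_efficiency = [0.50, 0.40, 0.35]   # modeled blanket silicon response
system_response = [1200.0, 900.0, 1500.0]  # lamp, pad and window effects

# Raw detector signals include the multiplicative system response.
raw_device = [g * r for g, r in zip(system_response, device_efficiency)]
raw_silicon = [g * r for g, r in zip(system_response, silicon_efficiency)]

measured_normalized = normalize_to_silicon(raw_device, raw_silicon)
model_normalized = normalize_to_silicon(device_efficiency, silicon_efficiency)
print(measured_normalized)
print(model_normalized)
```

Because the system response appears in both the numerator and the denominator, it cancels, and the normalized measurement can be compared directly with the normalized model output.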

Modeling diffracted light for a 2-D structure is more complicated, but similar in technique, extrapolated from a 1-D line to a 2-D plane.

The method described above is not the only way and not necessarily the fastest or most accurate way to determine the diffraction efficiency of a 1-D or 2-D structure. There are alternative techniques, e.g., described in “Multilayer modal method for diffraction gratings of arbitrary profile, depth, and permittivity” by Lifeng Li. But in these various techniques, the model includes diffraction caused by the repeating structure.

For at least two of the parameters, parameter values are calculated that provide a minimum difference between an output spectrum of the optical model and the measured spectrum. A first of the at least two parameters includes the parameter from which the polishing endpoint can readily be determined, e.g., the thickness of the first layer. A second of the at least two parameters can be an input parameter that represents a dimensional characteristic of the repeating feature. For example, the second of the at least two parameters can be a linewidth of the repeating feature. Other possibilities for the second of the at least two parameters include the line pitch, the area density of a material of the feature (e.g. how much of the area of the device being modeled is consumed by a given material), or the vertical shape and depth of structures (e.g. is a copper line best modeled as square, or is it tapered with depth).

In an example, to account for an array of traces on the substrate, the input parameters include the angle of incidence of the light (e.g., zero degrees), the pitch of the traces, the number of layers modeled, the thickness of each layer, the linewidth of the traces, the n and k values of the input and output planes, the n and k values of the feature(s) and the region(s) outside the feature(s) (e.g., the ridge and groove) for each layer, and the wavelength range analyzed. The values for the thickness of the outermost layer and the linewidth that provide the minimum difference between an output spectrum of the optical model and the measured spectrum are determined.

Some of the input parameters can have fixed values. Some of the input parameters can be permitted to vary; these are the parameters for which values will be determined as part of the fitting process. Those input parameters for which values are determined as part of the fitting can be limited to variation within predetermined ranges. The ranges for the input parameters can be chosen to 1) avoid degenerate fits, and 2) keep calculation time at a reasonable level. If the allowed range for a value of an input parameter is too great, the likelihood of a degenerate fit increases. A user can input into the model the nominal parameter values for some of the parameters (e.g. line width, expected thickness, and index of refraction and extinction coefficient for various materials). The user can also input into the model the permitted ranges for some of the parameter values. These nominal values and ranges can be based on the user's knowledge of the device/layer being polished.

As noted above, some boundary conditions can be imposed on the parameters. For example, the thickness t for a layer j can be permitted to vary between a minimum value T_(MINj) and a maximum value T_(MAXj). Similar boundary conditions can be imposed on the parameters that are material properties, e.g., index of refraction (n), extinction coefficient (k), and/or on parameters that are structural properties, e.g., the line width. The boundary values can be input by the operator based on knowledge of variation within the fabrication process.

In some implementations, the input parameters are fed directly into equations of the optical model. However, in some implementations the input parameters can be used to generate a plurality of pixel grids. Each layer of the device that has a different 2-D pattern is modeled with its own pixel grid, so that the 3-D device is represented by a stack of pixel grids. Each pixel grid in the stack can be assigned its own thickness. The grid is a user-defined size in the x and y directions, and the scale of the pixels can also be user-defined. Each pixel in a grid is assigned a refractive index and an extinction coefficient based on the material in the pixel. The diffraction is then calculated based on the array of pixels. By combining a series of grid slices, one can model any device in 3 dimensions.

For example, to model a region of repeating lines, the input parameters could include the linewidth and pitch of the lines, and the material composition of the lines and the material composition of the region between the lines. A pixel array would then be generated; a determination of whether the pixel is part of the line or part of the region between the lines is made based on the linewidth and pitch. If the pixel is part of the line, then it would be assigned index of refraction and extinction coefficient values for the material composition of the line. If the pixel is not part of the line, then it would be assigned index of refraction and extinction coefficient values for the material composition of the region between the lines.
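The pixel assignment described above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function names build_line_row and build_line_grid, the convention that each period begins with the line, and the (n, k) values in the usage note are assumptions introduced here.

```python
def build_line_row(n_pixels, pixel_size, linewidth, pitch, line_nk, gap_nk):
    """Return one row of (n, k) tuples for a pattern of repeating lines.

    Assumes each period of length `pitch` begins with a line of width
    `linewidth`; all lengths share the same (arbitrary) unit.
    """
    row = []
    for i in range(n_pixels):
        # Position of the pixel center within its period of the pattern.
        x = ((i + 0.5) * pixel_size) % pitch
        # Pixels inside the line get the line material, others the gap material.
        row.append(line_nk if x < linewidth else gap_nk)
    return row


def build_line_grid(nx, ny, pixel_size, linewidth, pitch, line_nk, gap_nk):
    """Return an ny-by-nx pixel grid; lines run along y, so all rows match."""
    row = build_line_row(nx, pixel_size, linewidth, pitch, line_nk, gap_nk)
    return [list(row) for _ in range(ny)]
```

For instance, build_line_grid(20, 4, 1.0, 4.0, 10.0, (2.0, 1.0), (1.5, 0.0)) would mark the first four pixels of every ten-pixel period as line material. The (n, k) pairs in that call are placeholders, not material data. A stack of such grids, each with its own thickness, would then be handed to the diffraction calculation.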

In some implementations, the optical model models the presence of a metal line. However, a metal liner material, e.g., tantalum, can be used to model the metal contribution instead of the material of the metal line, e.g., copper. Although it may be possible to completely model both the liner and the copper which lies below or next to the liner, this may be too complicated or computationally intensive; the model can be simplified and computation time reduced if only the liner material is used.

Some in-line monitoring systems illuminate a substrate with polarized light beams at multiple different angles of incidence, although unpolarized light can also be used. Some in-situ monitoring systems illuminate the substrate with unpolarized light, although polarized light can also be used. In addition, the unpolarized light can be at a single angle of incidence.

In some implementations, which can fit well with intentional use of polarized light, a polarization angle is used as an input parameter of the optical model. That is, the polarization angle is simply treated as a parameter to vary when optimizing the fit between the model and the measured spectrum.

In some implementations, which can fit well with unintentional use of polarized light, calculation of the output spectrum can include calculation of a first spectrum for a first polarization of light and calculation of a second spectrum for a second polarization of light. In this case, the relative contribution between the first polarization and the second polarization is used as an input parameter of the optical model. The relative contribution is treated as a parameter to vary when optimizing the fit between the model and the measured spectrum.

For example, the calculation of the model M can be represented as

M = X*S₁ + (1−X)*S₂

where S₁ is the first spectrum generated using the first polarization of light, S₂ is the second spectrum generated using the second polarization of light, and X is the relative contribution.

The calculation of the first spectrum and the second spectrum can otherwise be conducted with identical values for the input parameters. The first polarization can be s-polarization and the second polarization can be p-polarization.
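As a minimal sketch of the mixing relation above, the following shows how the two polarization spectra could be combined and how X could be estimated for a given measured spectrum. The function names are hypothetical, spectra are assumed to be equal-length lists sampled at the same wavelengths, and the closed-form estimate assumes a sum-of-squared-differences metric with all other parameters held fixed; in a full fit, X would instead be varied together with the other parameters.

```python
def mix_polarizations(s1, s2, x):
    """Return M = X*S1 + (1 - X)*S2, evaluated wavelength by wavelength."""
    return [x * a + (1.0 - x) * b for a, b in zip(s1, s2)]


def fit_contribution(measured, s1, s2):
    """Least-squares estimate of X for fixed S1 and S2, clamped to [0, 1].

    Writing M = S2 + X*(S1 - S2), the sum of squared differences is
    quadratic in X, so its minimum has a closed form.
    """
    delta = [a - b for a, b in zip(s1, s2)]
    denom = sum(d * d for d in delta)
    if denom == 0.0:
        # S1 and S2 are identical, so X has no effect; return a neutral value.
        return 0.5
    numer = sum((m - b) * d for m, b, d in zip(measured, s2, delta))
    return min(1.0, max(0.0, numer / denom))
```

Clamping keeps the estimate inside the physically meaningful range even when measurement noise would push the unconstrained solution slightly outside it.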

In some implementations, the optical model can include multiple optical sub-models. Each optical sub-model operates as the optical model described above, e.g., with various input parameters, but the different sub-models represent regions of different patterning on the substrate. Since the patterning is different, the effect of diffraction will be different, and the resulting spectrum will be different. Each sub-model can generate an intermediate spectrum, and the intermediate spectra can be combined to generate the output spectrum. The relative weight, e.g., percentage contribution, of each intermediate spectrum can be one of the parameters that is calculated as part of the fitting process.

This permits the optical model to account for the possibility that the light beam will illuminate regions with different patterns on the substrate. Thus the model can provide one output spectrum that would be generated if the light happened to be collected from two structures simultaneously, e.g., if the light spot rested halfway on one structure and halfway on a different structure. For example, if half of the light spot was on a 1-D grating of pitch A and the other half was on a structure of pitch B, then the proper model for such a reflectance spectrum would be a combination of the two with equal weighting.
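The weighted combination of intermediate spectra can be sketched as below. The function name combine_sub_models is hypothetical, and normalizing the weights so that they sum to one is an assumption made here so that percentage contributions and raw weights behave the same way.

```python
def combine_sub_models(intermediate_spectra, weights):
    """Combine sub-model spectra into one output spectrum by weighted average."""
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    n_points = len(intermediate_spectra[0])
    return [
        # Weighted sum at wavelength index i, normalized by the total weight.
        sum(w * s[i] for w, s in zip(weights, intermediate_spectra)) / total
        for i in range(n_points)
    ]
```

With two sub-models for pitch A and pitch B and weights [1, 1], this reproduces the equal-weighting case of a light spot resting halfway on each structure.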

In fitting the optical model to the measured spectrum, the parameters are selected to provide an output spectrum that is a close match to the measured spectrum. A close match can be considered to be the calculation of a minimum difference between the output spectrum and the measured spectrum, given the available computational power and time constraints. The thickness of the layer being polished can then be determined from the thickness parameter.

Calculation of a difference between the output spectrum and the measured spectrum can be a sum of absolute differences between the measured spectrum and the output spectrum across the spectra, or a sum of squared differences between the measured spectrum and the output spectrum. Other techniques for calculating the difference are possible, e.g., a cross-correlation between the measured spectrum and the output spectrum can be calculated.
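The three difference measures named above can be sketched as follows, assuming both spectra are equal-length lists sampled at the same wavelengths. The function names are hypothetical, and the mean-subtracted, normalized form of the cross-correlation is one common choice rather than something specified here.

```python
import math


def sum_abs_diff(measured, output):
    """Sum of absolute differences across the spectrum (lower is better)."""
    return sum(abs(m - o) for m, o in zip(measured, output))


def sum_sq_diff(measured, output):
    """Sum of squared differences across the spectrum (lower is better)."""
    return sum((m - o) ** 2 for m, o in zip(measured, output))


def normalized_cross_correlation(measured, output):
    """Zero-lag normalized cross-correlation in [-1, 1] (higher is better)."""
    mean_m = sum(measured) / len(measured)
    mean_o = sum(output) / len(output)
    numer = sum((m - mean_m) * (o - mean_o) for m, o in zip(measured, output))
    denom = math.sqrt(
        sum((m - mean_m) ** 2 for m in measured)
        * sum((o - mean_o) ** 2 for o in output)
    )
    # A flat spectrum has zero variance, so the correlation is undefined.
    return numer / denom if denom > 0.0 else 0.0
```

Note that the first two measures are minimized by a good fit, whereas the cross-correlation is maximized, so a fitting routine that minimizes would use, for example, one minus the correlation.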

Fitting the parameters to find the closest output spectrum can be considered an example of finding a global minimum of a function (the difference between the measured spectrum and the output spectrum generated by the function) in a multidimensional parameter space (with the parameters being the variable values in the function). For example, where the function is an optical model, the parameters can include the thickness, the index of refraction (n) and extinction coefficient (k) of the layers.

Regression techniques can be used to optimize the parameters to find a local minimum in the function. Examples of regression techniques include Levenberg-Marquardt (L-M), which utilizes a combination of gradient descent and Gauss-Newton; fminunc( ), a MATLAB function; lsqnonlin( ), a MATLAB function that uses the L-M algorithm; and simulated annealing. In addition, non-regression techniques, such as the simplex method, can be used to optimize the parameters.

Certain parameter values may be thrown out based on the polarization angle. For example, it may be known that s or p polarization (0° or 90°) does not contain useful information as the device size is shrunk.

A potential problem with using regression or non-regression techniques alone to find a minimum is that there may be multiple local minima in the function. If regression is commenced near a local minimum that is not the global minimum, then the wrong solution may be determined, as regression techniques will only go “downhill” to the nearest minimum. However, if multiple local minima are identified, regression could be performed on all of these minima and the best solution would be identified as the one with the least difference. An alternative approach would be to track all solutions from all local minima over a period of time, and determine which is the best one over time. Examples of techniques to identify global minima include genetic algorithms; multi-start (running the regression techniques from multiple starting points with parallel computing); global search, a MATLAB function; and pattern searching.
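The multi-start idea can be sketched for a single bounded parameter as follows. This is an illustration under stated assumptions, not the disclosed method: the function names are hypothetical, a simple compass (pattern) search stands in for whichever local regression technique is actually used, the starts are run sequentially rather than in parallel, and f would be the difference between the measured spectrum and the model output as a function of, e.g., thickness.

```python
def local_search(f, x0, lo, hi, step, tol=1e-6):
    """Bounded 1-D compass search: only ever moves "downhill" from x0."""
    x = min(hi, max(lo, x0))
    fx = f(x)
    while step > tol:
        moved = False
        for candidate in (x - step, x + step):
            candidate = min(hi, max(lo, candidate))  # enforce boundary values
            fc = f(candidate)
            if fc < fx:
                x, fx, moved = candidate, fc, True
                break
        if not moved:
            step *= 0.5  # no improvement at this scale, so refine the step
    return x, fx


def multi_start(f, starts, lo, hi, step, tol=1e-6):
    """Run the local search from every start and keep the lowest minimum."""
    results = [local_search(f, x0, lo, hi, step, tol) for x0 in starts]
    return min(results, key=lambda result: result[1])
```

A single call to local_search can settle in whichever basin the starting point happens to lie in; multi_start reduces that risk at the cost of one local search per starting point, which is why the number and spacing of starts is a trade-off against computation time.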

The output of the fitting process is a set of fitted parameters, including at least the parameters from which the polishing endpoint can readily be determined, e.g., the thickness parameter of the layer being polished. However, as noted above, the fitted parameter could also be an index value representing the time or number of platen rotations at which the spectrum would be expected to be observed in a polishing process that follows a predetermined progress.

Rather than thickness, some other metric can be calculated using one or more of the parameters that represent dimensions of the structure in the layer being polished. For example, the line width can be one of the parameters that is fitted, i.e., the line width is permitted to vary in the fitting process. Since the fitting is performed for each measured spectrum, this generates a sequence of parameter values that represent dimensions of the structure, e.g., a sequence of line width values.

In some implementations, for each measured spectrum, a metal line resistivity value Rs is calculated, e.g., by multiplying the layer thickness value by the line width value. This generates a sequence of metal line resistivity values. The endpoint can be determined from the sequence of metal line resistivity values.

Now referring to FIG. 8, which illustrates the results for only a single zone of a single substrate, the sequence of fitted endpoint parameter values, e.g., thickness values or resistance values, generated by fitting the optical model function to the sequence of measured spectra generates a time-varying sequence of values 212. This sequence of values 212 can be termed a trace 210. In general, the trace 210 can include one, e.g., exactly one, value per sweep of the optical monitoring system below the substrate.

As shown in FIG. 9, optionally a function, e.g., a polynomial function of known order, e.g., a first-order function (e.g., a line 214) is fit to the sequence of values derived from the measured spectra. The function can be fit using robust line fitting. Other functions can be used, e.g., polynomial functions of second-order, but a line provides ease of computation.

Optionally, the function can be fit to the values collected after time TC. Values for spectra collected before the time TC can be ignored when fitting the function to the sequence of values. This can assist in elimination of noise in the measured spectra that can occur early in the polishing process, or it can remove spectra measured during polishing of another layer.

Polishing can be halted at an endpoint time TE at which the line 214 crosses a target value TT. Alternatively, polishing can be halted simply at the time that the sequence of values crosses the target value, e.g., without fitting any function to the sequence.
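The line fit and endpoint projection can be sketched as follows. The function names are hypothetical, and ordinary least squares is used here as a simple stand-in for the robust line fitting mentioned above; values at or after t_c are kept, which is an assumption about how the cutoff time TC is applied.

```python
def fit_line(times, values):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("need at least two distinct times to fit a line")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def endpoint_time(times, values, target, t_c=0.0):
    """Project the time TE at which the fitted line crosses the target TT.

    Values collected before t_c are ignored. Returns None if the fitted
    line is flat and therefore never crosses the target.
    """
    kept = [(t, v) for t, v in zip(times, values) if t >= t_c]
    slope, intercept = fit_line([t for t, _ in kept], [v for _, v in kept])
    if slope == 0.0:
        return None
    return (target - intercept) / slope
```

Because the crossing time is projected from the fitted line rather than read off individual values, the estimate can be updated after every sweep and is less sensitive to noise in any single fitted value.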

FIG. 10 shows a flow chart of a method 700 of polishing a product substrate. The product substrate can have at least the same layer structure as what is represented in the optical model.

The product substrate is polished (step 702), and a sequence of measured spectra are obtained during polishing (step 704), e.g., using the in-situ monitoring system described above. Alternatively, the product substrate can be transported to an in-line monitoring system, before or after polishing, and one or more spectra can be measured.

An optical model for a layer stack having a plurality of layers is stored. The optical model has a plurality of input parameters including a first parameter and a second parameter. The second parameter is a polarization angle or a relative contribution between two orthogonal polarizations. The first parameter can be a layer thickness, index of refraction, or extinction coefficient of the layer. In addition, data is stored defining a plurality of first values for the first parameter and a plurality of second values for the second parameter.

The optical model is fit to the measured spectrum (step 706), i.e., the input parameters are varied to find an output spectrum of the optical model that provides a minimum difference between the output spectrum and the measured spectrum. The fitting includes finding a first value of the first parameter and a second value of the second parameter that provide the minimum. Since the second parameter is a polarization angle or a relative contribution between two orthogonal polarizations, the fitting includes finding a value for the polarization angle or a relative contribution between two orthogonal polarizations that provides a minimum difference between an output spectrum of the optical model and the measured spectrum. Optionally, fitting the parameters to the measured spectrum includes calculating the output spectrum using diffraction effects of the repeating structure. The thickness or other characteristic value is provided by the fitted first value of the first parameter (step 708).

Alternatively, software can be used to automatically calculate multiple reference spectra using the optical model described above in order to generate a library of reference spectra (step 701). This can occur prior to polishing of the substrate. For each combination of a first value from the plurality of first values and a second value from the plurality of second values, a reference spectrum is calculated using the optical model based on the first value and the second value, to generate a plurality of reference spectra.

The best-matching reference spectrum out of the plurality of reference spectra is determined (step 706 a). The thickness or other characteristic value is provided by the first value of the first parameter that was used to generate the best-matching reference spectrum (step 708 a).
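The library approach can be sketched as follows. The names build_library and best_match are hypothetical, the callable model(first_value, second_value) is a stand-in for the optical model and is assumed to return a spectrum as a list, and a sum of squared differences is assumed as the matching criterion.

```python
from itertools import product


def build_library(model, first_values, second_values):
    """Calculate one reference spectrum per (first, second) value combination."""
    return {
        (v1, v2): model(v1, v2)
        for v1, v2 in product(first_values, second_values)
    }


def best_match(library, measured):
    """Return the (first, second) key whose reference spectrum is closest."""
    def sum_sq_diff(reference):
        return sum((r - m) ** 2 for r, m in zip(reference, measured))

    return min(library, key=lambda key: sum_sq_diff(library[key]))
```

The first element of the returned key is the first value (e.g., thickness) associated with the best-matching reference spectrum. Because the library is calculated before polishing, the work during polishing reduces to one comparison per stored reference spectrum.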

Where a sequence of spectra is collected over time during polishing, a function, e.g., a linear function, is fit to the sequence of values for the measured spectra (step 710). Polishing can be halted once the endpoint value (e.g., a calculated parameter value, e.g., a thickness value, generated from the linear function fit to the sequence of parameter values) reaches a target value (step 712).

Although the discussion above assumes a rotating platen with an optical endpoint monitor installed in the platen, the system could be applicable to other types of relative motion between the monitoring system and the substrate. For example, in some implementations, e.g., orbital motion, the light source traverses different positions on the substrate, but does not cross the edge of the substrate. In such cases, the collected spectra can still be grouped, e.g., spectra can be collected at a certain frequency and spectra collected within a time period can be considered part of a group. The time period should be sufficiently long that five to twenty spectra are collected for each group.

As noted above, the optical model has a plurality of input parameters. At least one input parameter is allowed to vary over a predetermined range at a predetermined increment. In some implementations, a reference spectrum is calculated for each combination of values for at least two input parameters that are allowed to vary. For each parameter that is allowed to vary, the manufacturer can input data indicating a range (e.g., a maximum and a minimum value) and an increment between values within the range.

The thickness of the overlying layer can be one of the at least two input parameters. If the reference spectra are to be used during polishing, then the thickness can be allowed to vary over a large range, e.g., from about the expected starting thickness to the expected ending thickness. For example, the thickness might vary from 0 to 3000 Angstroms in 10 Angstrom increments. On the other hand, if the reference spectra are to be used at an in-sequence metrology station, then the thickness can be allowed to vary over a narrower range centered around the expected starting or ending thickness. For example, the thickness might vary from 2700 to 2900 Angstroms in 10 Angstrom increments.

Since there are variations in the thicknesses of the underlying layers of the incoming substrates, the thickness of at least one of the underlying layers can be another of the two or more parameters. The manufacturer can input a thickness range and a thickness increment for the at least one of the underlying layers, e.g., for multiple underlying layers. For example, the thickness of the underlying layer might vary from 2700 to 2900 Angstroms in 10 Angstrom increments.

In addition to variations of the layer thicknesses, the optical model can include variations in the index of refraction and/or the extinction coefficient of one or more layers in the optical stack. Thus, the index of refraction and/or the extinction coefficient of one or more layers in the optical stack can be another of the two or more parameters. The manufacturer can input a refractive index range and a refractive index increment for at least one of the underlying layers, e.g., for multiple underlying layers. The manufacturer can input an extinction coefficient range and an extinction coefficient increment for at least one of the underlying layers, e.g., for multiple underlying layers. For example, the user might choose to vary the index of refraction by modeling the dispersion coefficients with a Cauchy equation, and varying the A and B coefficients of the Cauchy equation. For example, the user might vary the index of refraction parameter A between 1.40 and 1.45 at 0.01 increments. The user can similarly model extinction coefficients with the equation k=A+exp(B−12400*(1/lambda−1/C)) with lambda in Angstroms. A might range from 0.003 to 0.006 with increment 0.001, and B might vary from 0.45 to 0.55 with increment 0.01.
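The two dispersion relations can be sketched as follows. The function names are hypothetical. The two-term form n = A + B/lambda² is assumed for the Cauchy equation, since only the A and B coefficients are mentioned, and B is then in squared Angstroms; the extinction relation is transcribed directly from the equation above, and no value for C is implied.

```python
import math


def cauchy_index(wavelength, a, b):
    """Two-term Cauchy equation n = A + B / lambda**2 (lambda in Angstroms)."""
    return a + b / wavelength ** 2


def extinction_coefficient(wavelength, a, b, c):
    """k = A + exp(B - 12400 * (1/lambda - 1/C)), with lambda in Angstroms."""
    return a + math.exp(b - 12400.0 * (1.0 / wavelength - 1.0 / c))
```

Evaluating these functions over the analyzed wavelength range for each candidate value of A and B yields the wavelength-dependent n and k values that the optical model takes as input for that layer.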

The one or more layers can include the underlying layer and/or the overlying layer. The one or more layers may include a layer of silicon oxide, carbon-doped silicon oxide, silicon carbide, silicon nitride, carbon-doped silicon nitride and/or polysilicon. Depending on the composition and deposition method for the layers on the substrate, some spectral measurements may be made from substrates with a layer having a higher index of refraction or extinction coefficient, whereas other spectral measurements may be made from substrates with a layer having a lower index of refraction or extinction coefficient.

As noted above, the optical model can include variations in the polarization angle. In some implementations, the polarization angle can be another of the two or more parameters. The manufacturer can input a polarization range and a polarization increment. For example, the polarization range might vary from 0° (s-polarization) to 90° (p-polarization) in 5° increments.

Alternatively, the optical model can include the relative contribution of two polarization angles. Thus, where the reference spectrum generated from the optical model is represented by M=X*S₁+(1−X)*S₂, the relative contribution X can be another of the two or more parameters. The manufacturer can input a contribution range and a contribution increment. For example, the relative contribution might vary from 0 to 1 in 0.1 increments.
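The range-and-increment inputs described above can be expanded into explicit value lists as sketched below, using the example ranges from the text. The helper name values_in_range is hypothetical, and it assumes that the range is a whole number of increments wide.

```python
from itertools import product


def values_in_range(minimum, maximum, increment):
    """List the values from minimum to maximum inclusive at a fixed increment."""
    # Compute the step count once so rounding errors do not accumulate.
    count = int(round((maximum - minimum) / increment))
    return [minimum + i * increment for i in range(count + 1)]


thicknesses = values_in_range(0, 3000, 10)      # Angstroms
angles = values_in_range(0, 90, 5)              # degrees, s to p polarization
contributions = values_in_range(0.0, 1.0, 0.1)  # relative contribution X

# One reference spectrum would be calculated per combination.
combinations = list(product(thicknesses, angles))
```

The number of combinations is the product of the list lengths, so each additional varied parameter multiplies the size of the library; this is one reason to keep the ranges narrow and the increments as coarse as the required accuracy allows.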

In a first example, polarized light is directed onto the substrate at the in-line monitoring station to measure a spectrum. The controller fits the optical model to the measured spectrum, and the polarization angle is treated as a parameter to vary when optimizing the fit between the model and the measured spectrum.

In a second example, ostensibly unpolarized light is directed onto the substrate by the in-situ monitoring system. The controller fits the optical model to the measured spectrum, and the relative contribution between two orthogonal polarizations is treated as a parameter to vary when optimizing the fit between the model and the measured spectrum.

In a third example, applicable to either an in-situ or in-line monitoring system, a plurality of reference spectra are generated from the optical model using a plurality of different values for the relative contribution between two orthogonal polarizations. Ostensibly unpolarized light is directed onto the substrate to measure a spectrum. The controller searches for the reference spectrum of the plurality of reference spectra that provides the best match to the measured spectrum.

In a fourth example, applicable to either an in-situ or in-line monitoring system, a plurality of reference spectra are generated from the optical model using a plurality of different values for polarization. Polarized light is directed onto the substrate to measure a spectrum. The controller searches for the best-matching reference spectrum of the plurality of reference spectra.

In the examples above, polarized light is used when the optical model uses polarization as an input parameter, but in other implementations ostensibly unpolarized light is used. Similarly, in the examples above, unpolarized light is used when the optical model uses the relative contribution of two orthogonal polarizations as an input parameter, but in other implementations polarized light is used.

Without being limited to any particular theory, by using polarization or relative contribution of orthogonal polarizations as an input parameter to the model when generating reference spectra or fitting the model, errors due to uncertainty in the orientation of the chip structure and/or partial polarization of ostensibly unpolarized light may be reduced.

As used in the instant specification, the term substrate can include, for example, a product substrate (e.g., which includes multiple memory or processor dies), a test substrate, a bare substrate, and a gating substrate. The substrate can be at various stages of integrated circuit fabrication, e.g., the substrate can be a bare wafer, or it can include one or more deposited and/or patterned layers. The term substrate can include circular disks and rectangular sheets.

Embodiments of the invention and all of the functional operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structural means disclosed in this specification and structural equivalents thereof, or in combinations of them. Embodiments of the invention can be implemented as one or more computer program products, i.e., one or more computer programs tangibly embodied in a non-transitory machine readable storage media, for execution by, or to control the operation of, data processing apparatus, e.g., a programmable processor, a computer, or multiple processors or computers.

The above described polishing apparatus and methods can be applied in a variety of polishing systems. Either the polishing pad, or the carrier heads, or both can move to provide relative motion between the polishing surface and the substrate. For example, the platen may orbit rather than rotate. The polishing pad can be a circular (or some other shape) pad secured to the platen. Some aspects of the endpoint detection system may be applicable to linear polishing systems, e.g., where the polishing pad is a continuous or a reel-to-reel belt that moves linearly. The polishing layer can be a standard (for example, polyurethane with or without fillers) polishing material, a soft material, or a fixed-abrasive material. Terms of relative positioning are used; it should be understood that the polishing surface and substrate can be held in a vertical orientation or some other orientation.

Although the description above has focused on control of a chemical mechanical polishing system, the in-sequence metrology station can be applicable to other types of substrate processing systems, e.g., etching or deposition systems.

Particular embodiments of the invention have been described. Other embodiments are within the scope of the following claims. 

What is claimed is:
 1. A method of controlling a polishing operation, comprising: storing an optical model for a layer stack having a plurality of layers, the optical model having a plurality of input parameters, the plurality of input parameters including a first parameter and a second parameter, wherein the second parameter is a polarization angle or a relative contribution between two orthogonal polarizations; measuring a spectrum reflected from the substrate with an in-sequence or in-situ monitoring system to provide a measured spectrum; fitting the optical model to the measured spectrum, the fitting including finding a first value of the first parameter and a second value of the second parameter that provides a minimum difference between an output spectrum of the optical model and the measured spectrum; polishing the substrate with the polishing apparatus; and adjusting a polishing endpoint or a polishing parameter of the polishing apparatus based on the first value associated with the best matching reference spectrum.
 2. The method of claim 1, wherein measuring the spectrum is performed with the in-line monitoring system before or after the polishing of the substrate.
 3. The method of claim 1, wherein measuring the spectrum is performed with an in-situ monitoring system during the polishing of the substrate.
 4. The method of claim 1, wherein measuring the spectrum comprises directing polarized light onto the substrate.
 5. The method of claim 4, wherein the second parameter comprises a polarization angle.
 6. The method of claim 1, wherein measuring the spectrum comprises directing ostensibly unpolarized light onto the substrate.
 7. The method of claim 6, wherein the second parameter comprises the relative contribution between two orthogonal polarizations.
 8. The method of claim 1, wherein the first parameter comprises a thickness of an outermost layer.
 9. A method of controlling a polishing operation, comprising: storing an optical model for a layer stack having a plurality of layers, the optical model having a plurality of input parameters, the plurality of input parameters including a first parameter and a second parameter, wherein the second parameter is a polarization angle or a relative contribution between two orthogonal polarizations; storing data defining a plurality of first values for the first parameter and a plurality of second values for the second parameter; for each combination of a first value from the plurality of first values and a second value from the plurality of second values, calculating a reference spectrum using the optical model based on the first value and the second value, to generate a plurality of reference spectra; measuring a spectrum reflected from the substrate with an in-sequence or in-situ monitoring system to provide a measured spectrum; determining a best matching reference spectrum from the plurality of reference spectra that provides a best match to the measured spectrum; determining the first value associated with the best matching reference spectrum; polishing the substrate with the polishing apparatus; and adjusting a polishing endpoint or a polishing parameter of the polishing apparatus based on the first value associated with the best matching reference spectrum.
 10. The method of claim 9, wherein measuring the spectrum is performed with the in-line monitoring system before or after the polishing of the substrate.
 11. The method of claim 9, wherein measuring the spectrum is performed with an in-situ monitoring system during the polishing of the substrate.
 12. The method of claim 9, wherein measuring the spectrum comprises directing polarized light onto the substrate.
 13. The method of claim 12, wherein the second parameter comprises a polarization angle.
 14. The method of claim 9, wherein measuring the spectrum comprises directing ostensibly unpolarized light onto the substrate.
 15. The method of claim 14, wherein the second parameter comprises the relative contribution between two orthogonal polarizations.
 16. The method of claim 9, wherein the first parameter comprises a thickness of an outermost layer. 